September 12th, 2019 - New York

Automated dashboards for
1 billion users

Introducing the use case

  • A large international publisher would like to keep track of which articles have been read in the last month, and to distinguish good from bad “evergreen” and “news” stories.
  • They would like an interactive dashboard, but have no Shiny server
  • They would like an API that returns content-velocity data, for integration with other languages

Choice of tech

  • ga_model to package up GA data modelling and presentation
  • plotly for interactive plots, deployed via HTTP GET
  • plumber deployed on Cloud Run for scale
  • Docker to create reproducible environments
  • Cloud Build to build the Docker images

Creating article velocity R script

Thanks to Tim Wilson’s code on dartistics.com

library(googleAnalyticsR)
# a ga_model wrapper
ga_time_normalised <- function(viewID, interactive_plot=TRUE){
  model <- ga_model_load("time-normalised.gamr")
  ga_model(viewID, model,
           first_day_pageviews_min = 2,
           total_unique_pageviews_cutoff = 500,
           days_live_range = 60,
           page_filter_regex = ".*",
           interactive_plot = interactive_plot)
}
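The `time-normalised.gamr` file is a saved model object, so the wrapper above bundles data fetch, modelling and plotting into one call. A sketch of using it from a local session, assuming you have already authenticated (the viewID below is a placeholder, not a real view):

```r
library(googleAnalyticsR)

# authenticate interactively (or via a service account key)
ga_auth()

# placeholder viewID - substitute your own Google Analytics view
result <- ga_time_normalised(123456, interactive_plot = TRUE)

# a ga_model result carries both the processed data and the plot,
# which is why the plumber endpoints later expose $output and $plot
result$output  # data.frame of time-normalised pageviews
result$plot    # the interactive plotly htmlwidget
```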

Creating article velocity R script (output)

plumber

plumber APIs

https://www.rplumber.io/

Make an API out of your script:

#' @get /hello
#' @html
function(){
  "<html><h1>hello world</h1></html>"
}
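Saved as `api.R`, the annotated file above can be served straight from a local R session (the port is an arbitrary choice):

```r
library(plumber)

# parse the plumber annotations and start serving the endpoints
pr <- plumb("api.R")
pr$run(port = 8000)

# then visit http://localhost:8000/hello in a browser
```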

Adapt plumber API for the model

library(googleAnalyticsR)

#' Return output data from the ga_time_normalised ga_model
#' @param viewID The viewID for Google Analytics
#' @get /data
function(viewID=""){
  model <- ga_time_normalised(viewID)
  model$output
}

#' Plot out data from the ga_time_normalised ga_model
#' @param viewID The viewID for Google Analytics
#' @get /plot
#' @serializer htmlwidget
function(viewID=""){
  model <- ga_time_normalised(viewID)
  model$plot
}

Adapt plumber API for the model - test in local R session
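One way to exercise the endpoints locally, sketched with `callr` and `httr` (the port and viewID are placeholders):

```r
library(httr)

# serve the API in a background R process so this session stays free
api <- callr::r_bg(function() {
  plumber::plumb("api.R")$run(host = "127.0.0.1", port = 8000)
})
Sys.sleep(3)  # give the API a moment to start up

# fetch the model's output data for a placeholder viewID
resp <- GET("http://127.0.0.1:8000/data",
            query = list(viewID = "123456"))
content(resp)

api$kill()
```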

Deployment on Cloud Run

Cloud Run deployment

Cloud Run Docker file

Based on:

FROM trestletech/plumber
LABEL maintainer="mark"

COPY [".", "./"]

ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=as.numeric(Sys.getenv('PORT')))"]
CMD ["api.R"]

Cloud Run Docker file - autogenerated

library(containerit)
dd <- dockerfile("api.R")
write(dd, "Dockerfile")

Then add to the Dockerfile any packages needed by the model.

Cloud Run deployment - server-side auth

  • Server-side - a JSON credentials file for the GA account, downloaded within api.R
library(googleCloudStorageR)
if(!is.null(googleAuthR::gar_gce_auth())){
  gcs_get_object("ga-auth.json", bucket="your-bucket",
                 saveToDisk = "ga-auth.json", overwrite = TRUE)
}

library(googleAnalyticsR)
#..do calls..
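Once the JSON key is on disk, the script can authenticate non-interactively before making any GA calls. A sketch using `googleAuthR` (the filename follows the snippet above; the scope is an assumption for read-only GA access):

```r
library(googleAuthR)
library(googleAnalyticsR)

# authenticate with the downloaded service account key
gar_auth_service(
  json_file = "ga-auth.json",
  scope = "https://www.googleapis.com/auth/analytics.readonly"
)

# subsequent googleAnalyticsR calls in api.R now run as the service account
```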

Cloud Run deployment - client-side auth

  • Client-side - use Cloud Run’s authenticated invocations to restrict who can call the API

Continuous Development with Cloud Build

Set up a build trigger for the GitHub repo you commit the Dockerfile to:

Cloud Build successful

Deploy to Cloud Run

Deployed on Cloud Run

Thank you!